Learning Neuro-symbolic Programs for Language Guided Robot Manipulation
Given a natural language instruction and an input scene, our goal is to train
a model to output a manipulation program that can be executed by the robot.
Prior approaches for this task have one of the following limitations: (i) they
rely on hand-coded symbols for concepts, limiting generalization beyond those
seen during training [1]; (ii) they infer action sequences from instructions but
require dense sub-goal supervision [2]; or (iii) they lack the semantics required
for the deeper object-centric reasoning inherent in interpreting complex
instructions [3]. In contrast, our approach handles both linguistic and
perceptual variations, is end-to-end trainable, and requires no intermediate
supervision. The
proposed model uses symbolic reasoning constructs that operate on a latent
neural object-centric representation, allowing for deeper reasoning over the
input scene. Central to our approach is a modular structure consisting of a
hierarchical instruction parser and an action simulator to learn disentangled
action representations. Our experiments in a simulated environment with a 7-DOF
manipulator, covering instructions with varying numbers of steps and scenes
with different numbers of objects, demonstrate that our model is robust to such
variations and significantly outperforms baselines, particularly in the
generalization settings. The code, dataset and experiment videos are available
at https://nsrmp.github.io.
Comment: International Conference on Robotics and Automation (ICRA), 2023
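To make the modular structure concrete, here is a minimal Python sketch of the pipeline the abstract describes: a parser maps the instruction to a program, and an action simulator executes each program step over a latent object-centric scene representation. This is an illustrative sketch only, not the authors' implementation; all names (SceneObject, parse_instruction, simulate_action, execute) are hypothetical stand-ins.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class SceneObject:
        embedding: List[float]  # latent neural object-centric representation

    def parse_instruction(instruction: str) -> List[str]:
        # Stand-in for the hierarchical instruction parser: here the "program"
        # is just the instruction's token sequence.
        return instruction.lower().split()

    def simulate_action(scene: List[SceneObject], step: str) -> List[SceneObject]:
        # Stand-in for the learned action simulator: a real model would
        # transform the object embeddings to reflect the action's effect.
        return scene

    def execute(instruction: str, scene: List[SceneObject]) -> List[SceneObject]:
        # Symbolic control flow over latent neural representations: the program
        # runs step by step, with no hand-coded concept symbols and no dense
        # sub-goal supervision.
        for step in parse_instruction(instruction):
            scene = simulate_action(scene, step)
        return scene

The point of the factorization is that the parser and simulator can be trained jointly from instruction/scene pairs, which is what lets the action representations stay disentangled from any fixed symbol vocabulary.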
Tracking what matters: A decision-variable account of human behavior in bandit tasks
We study human learning and decision-making in tasks with probabilistic rewards. Recent studies of a 2-armed bandit task find that a modification of classical Q-learning algorithms, with outcome-dependent learning rates, explains behavior better than constant learning rates. We propose a simple alternative: humans directly track the decision variable underlying choice in the task. Under this policy-learning perspective, asymmetric learning can be reinterpreted as increasing confidence in the preferred choice. We provide specific update rules for incorporating partial feedback (outcomes on chosen arms) and complete feedback (outcomes on chosen and unchosen arms), and show that our model consistently outperforms previously proposed models on a range of datasets. Our model and update rules also add nuance to previous findings of perseverative behavior in bandit tasks; we show evidence of outcome-dependent choice perseveration, i.e., humans persevere in their choices unless contradictory evidence is presented.
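To make the contrast concrete, here is a minimal Python sketch of the two model families for a 2-armed bandit with partial feedback. The update rules below are generic delta rules chosen for illustration; the paper's exact decision-variable updates (and its complete-feedback rule) are not reproduced here, and all function names and parameter values are hypothetical.

    import random

    def asymmetric_q_learning(rewards, alpha_pos=0.4, alpha_neg=0.1, n_trials=200):
        # Q-learning with outcome-dependent learning rates: positive prediction
        # errors are learned from faster than negative ones.
        q = [0.5, 0.5]
        choices = []
        for _ in range(n_trials):
            c = 0 if q[0] >= q[1] else 1              # greedy choice, for simplicity
            r = float(random.random() < rewards[c])   # Bernoulli reward on chosen arm
            delta = r - q[c]
            q[c] += (alpha_pos if delta > 0 else alpha_neg) * delta
            choices.append(c)
        return choices

    def decision_variable_learner(rewards, alpha=0.2, n_trials=200):
        # Directly track a single signed quantity v: the preference for arm 0
        # over arm 1 (the decision variable underlying choice).
        v = 0.0
        choices = []
        for _ in range(n_trials):
            c = 0 if v >= 0 else 1
            r = float(random.random() < rewards[c])
            # Partial feedback: a rewarded choice pushes v toward the chosen
            # arm (growing confidence in the preferred option); an unrewarded
            # choice pushes it away.
            sign = 1.0 if c == 0 else -1.0
            v += alpha * sign * (2.0 * r - 1.0)
            choices.append(c)
        return choices

    # Example usage: arm 0 pays off 70% of the time, arm 1 only 30%.
    print(decision_variable_learner(rewards=[0.7, 0.3])[-10:])

Under this framing, apparent learning-rate asymmetry falls out of tracking one decision variable rather than two action values, which is the reinterpretation the abstract proposes.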